Vision-based Detection of Acoustic Timed Events: a Case Study on Clarinet Note Onsets

Authors

  • Alessio Bazzica
  • J. C. van Gemert
  • Cynthia C. S. Liem
  • Alan Hanjalic
Abstract

Acoustic events often have a visual counterpart. Knowledge of visual information can aid the understanding of complex auditory scenes, even when only a stereo mixdown is available in the audio domain, e.g., identifying which musicians are playing in large musical ensembles. In this paper, we consider a vision-based approach to note onset detection. As a case study we focus on challenging, real-world clarinetist videos and carry out preliminary experiments on a 3D convolutional neural network based on multiple streams and purposely avoiding temporal pooling. We release an audiovisual dataset with 4.5 hours of clarinetist videos together with cleaned annotations which include about 36,000 onsets and the coordinates for a number of salient points and regions of interest. By performing several training trials on our dataset, we learned that the problem is challenging. We found that the CNN model is highly sensitive to the optimization algorithm and hyper-parameters, and that treating the problem as binary classification may prevent the joint optimization of precision and recall. To encourage further research, we publicly share our dataset, annotations and all models and detail which issues we came across during our preliminary experiments.
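As an illustration of the precision/recall trade-off mentioned in the abstract, the following is a minimal sketch of how detected onsets might be scored against annotated ones. It is not the authors' evaluation code: the function names, the greedy one-to-one matching, and the 50 ms tolerance are all assumptions made for this example.

```python
def match_onsets(reference, detected, tolerance=0.05):
    """Count one-to-one matches between reference and detected onset times.

    Times are in seconds. The 50 ms default tolerance is an assumption
    for this sketch, not a value taken from the paper.
    """
    reference = sorted(reference)
    detected = sorted(detected)
    i = j = hits = 0
    while i < len(reference) and j < len(detected):
        diff = detected[j] - reference[i]
        if abs(diff) <= tolerance:
            hits += 1          # detection falls inside the tolerance window
            i += 1
            j += 1
        elif diff < 0:
            j += 1             # detection precedes the reference: false positive
        else:
            i += 1             # reference has no detection: false negative
    return hits


def precision_recall_f1(reference, detected, tolerance=0.05):
    """Return (precision, recall, F1) for a list of detected onsets."""
    hits = match_onsets(reference, detected, tolerance)
    precision = hits / len(detected) if detected else 0.0
    recall = hits / len(reference) if reference else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    annotated = [1.0, 2.0, 3.0]          # hypothetical annotated onsets
    predicted = [1.02, 2.5, 3.04, 4.0]   # hypothetical detector output
    print(precision_recall_f1(annotated, predicted))
```

Under this scoring, a detector that fires more often tends to raise recall at the cost of precision, which is why the two are usually reported together with their harmonic mean (F1).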

Similar resources

Cluster Analysis of Acoustic Emission Signals for Carbon/Epoxy Composite in Four-point Bending Test (RESEARCH NOTE)

Due to the extensive use of composites in various industries and the fact that defects reduce ultimate strength and efficiency during operation, detection of failures in composite parts is very important. The aim of this paper is to use Acoustic Emission (AE) non-destructive method in four-point bending test of carbon/epoxy composite to analyze and examine the failure mechanisms. This method is...

Automatic Transcription of Ornamented Irish Traditional Flute Music Using Hidden Markov Models

This paper presents an automatic system for note transcription of Irish traditional flute music containing ornamentation. This is a challenging problem due to the soft nature of onsets and short durations of ornaments. The proposed automatic transcription system is based on hidden Markov models, with separate models being built for notes and for single-note ornaments. Mel-frequency cepstral coe...

Automatic detection of manner events based on temporal parameters

In this study, we investigated how well acoustic events extracted from a cross-spectral temporal measure could be used to classify the manner and voicing of consonants. In particular, we developed seven measures that look at the strength and time difference between various onsets and offsets of acoustic energy. Consistent with findings by Shannon et al. (1995), our classification results show t...

Detection of distinct sound events in acoustic signals

The term onset detection is used to refer to the detection of the beginnings of discrete events in acoustic signals. A percept of an onset is caused by a noticeable change in the intensity, pitch or timbre of the sound. A fundamental problem in the design of an onset detection system is distinguishing genuine onsets from gradual changes and modulations that take place during the ringing of a so...

Acoustic detection of apple mealiness based on support vector machine

Mealiness degrades the quality of apples and plays an important role in fruit market. Therefore, the use of reliable and rapid sensing techniques for nondestructive measurement and sorting of fruits is necessary. In this study, the potential of acoustic signals of rolling apples on an inclined plate as a new technique for nondestructive detection of Red Delicious apple mealiness was investigate...

Journal:
  • CoRR

Volume: abs/1706.09556

Publication date: 2017